Guiding Practical Text Classification Framework to Optimal State in Multiple Domains

Authors

  • Sung-Pil Choi
  • Sung-Hyon Myaeng
  • Hyun-Yang Cho
Abstract

This paper introduces DICE, a Domain-Independent text Classification Engine. DICE is robust, efficient, and domain-independent in both its software and its architecture. Each module of the system is cleanly separated and encapsulated for extensibility. This modular architecture allows simple, continuous verification and makes it easy to change the system over multiple cycles, even after its major development period is complete. Users of DICE can implement their ideas on this test bed and optimize it for a particular domain simply by adjusting the configuration file. Unlike other publicly available toolkits or development environments, which target general-purpose classification models, DICE specializes in text classification and offers a number of functions specific to it. This paper focuses on how to locate the optimal state of a practical text classification framework using the adaptation methods the system provides, such as feature selection, lemmatization, and the choice of classification model.
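The idea of locating an optimal state by adjusting configuration options can be illustrated with a small sketch. The code below is not DICE's API or configuration format, which the abstract does not describe. Every name in it (`GRID`, `evaluate`, `search`, the toy data) is hypothetical, the suffix stripping is only a crude stand-in for real lemmatization, and document frequency is just one possible feature selection criterion. It tries each combination of two options, scores a naive Bayes classifier on held-out text, and keeps the best-scoring configuration. The choice of classification model could be added as a third option in the same way.

```python
# Illustrative sketch only: none of these names come from DICE itself.
import math
from collections import Counter
from itertools import product

# Toy data (invented for this example).
TRAIN = [
    ("stocks rallied as markets opened", "finance"),
    ("the bank raised interest rates", "finance"),
    ("investors traded shares and bonds", "finance"),
    ("the team scored two goals", "sports"),
    ("players trained before the match", "sports"),
    ("the coach praised the goalkeeper", "sports"),
]
HELD_OUT = [
    ("markets and shares rallied", "finance"),
    ("the bank traded bonds", "finance"),
    ("the players scored a goal", "sports"),
    ("the coach trained the team", "sports"),
]

def tokenize(text, lemmatize):
    tokens = text.lower().split()
    if lemmatize:
        # Crude plural stripping, a stand-in for a real lemmatizer.
        tokens = [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]
    return tokens

def select_features(docs, top_k):
    # Keep the top_k terms ranked by document frequency.
    df = Counter(term for tokens, _ in docs for term in set(tokens))
    ranked = sorted(df, key=lambda t: (-df[t], t))
    return set(ranked[:top_k])

def train_nb(docs, vocab):
    # Per-class term counts and class document counts.
    counts, priors = {}, Counter()
    for tokens, label in docs:
        priors[label] += 1
        counts.setdefault(label, Counter()).update(t for t in tokens if t in vocab)
    return counts, priors

def predict_nb(model, vocab, tokens):
    # Multinomial naive Bayes with add-one smoothing.
    counts, priors = model
    total_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label in sorted(priors):
        denom = sum(counts[label].values()) + len(vocab)
        score = math.log(priors[label] / total_docs)
        for t in tokens:
            if t in vocab:
                score += math.log((counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def evaluate(config):
    # Accuracy on the held-out set for one configuration.
    train = [(tokenize(text, config["lemmatize"]), label) for text, label in TRAIN]
    vocab = select_features(train, config["top_k"])
    model = train_nb(train, vocab)
    hits = sum(
        predict_nb(model, vocab, tokenize(text, config["lemmatize"])) == label
        for text, label in HELD_OUT
    )
    return hits / len(HELD_OUT)

def search(grid):
    # Exhaustive search; ties keep the first configuration encountered.
    keys = sorted(grid)
    best_config, best_acc = None, -1.0
    for values in product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        acc = evaluate(config)
        if acc > best_acc:
            best_config, best_acc = config, acc
    return best_config, best_acc

GRID = {"lemmatize": [False, True], "top_k": [5, 20]}

if __name__ == "__main__":
    config, accuracy = search(GRID)
    print(config, accuracy)
```

An exhaustive search is only practical when the grid is small. With more options or larger corpora, a cheaper search strategy and cross-validation rather than a single held-out split would likely be needed.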

Journal:
  • TIIS

Volume 3

Publication date: 2009